In the previous chapters we have often discussed the powerful concept of looping in Python. Using loops, we can easily repeat certain actions when coding. With for-loops, for instance, it is really easy to visit the items in a list and print them. In this chapter, we will discuss some more advanced forms of looping, as well as new, quick ways to create and deal with lists and other data sequences.
The first new function that we will discuss here is range(). Using this function, we can quickly generate a sequence of numbers in a specific range:
In [ ]:
for i in range(10):
    print(i)
Here, range() will produce a series of integers, starting from zero, up to (but not including) the number which we pass as an argument to the function. Using range() is of course much more convenient for generating such series of numbers than writing e.g. a while-loop to achieve the same result. Note that we can pass more than one argument to range() if we want to start counting from a number other than zero (zero being the default when you only pass a single argument to the function):
In [ ]:
for i in range(300, 306):
    print(i)
We can even specify a 'step size' as a third argument, which controls by how much the number increases with each step:
In [ ]:
for i in range(15, 26, 3):
    print(i)
If you don't specify the step size explicitly, it will default to 1. If you want to print the numbers that range() produces, or store them as a list, you have to convert the result explicitly, for instance to a list:
In [ ]:
numbers = list(range(10))
print(numbers[3:])
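The three ways of calling range() discussed above can be put side by side, as a small sketch. The variable names are made up for this illustration, and the last line uses a negative step size to count downwards, which the text above does not cover.

```python
# Converting each range to a list makes the generated numbers visible.
stop_only = list(range(5))              # start defaults to 0, step defaults to 1
start_stop = list(range(300, 306))      # the stop value itself is excluded
with_step = list(range(15, 26, 3))      # every third number, starting at 15
counting_down = list(range(10, 0, -2))  # not covered above: a negative step counts down

print(stop_only)
print(start_stop)
print(with_step)
print(counting_down)
```

In each case the stop value is left out, which is why range(15, 26, 3) ends at 24 and the countdown ends at 2.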
Of course, range() can also be used to iterate over the items in a list or tuple, typically in combination with calling len() to avoid IndexErrors:
In [ ]:
words = "Be yourself; everyone else is already taken".split()
for i in range(len(words)):
    print(words[i])
Naturally, the same result can just as easily be obtained by looping over the list directly:
In [ ]:
for word in words:
    print(word)
One drawback of such an easy-to-write loop, however, is that it doesn't keep track of the index of the word that we are printing in each iteration. Suppose that we would like to print the index of each word in our example above. We would then have to work with a counter...
In [ ]:
counter = 0
for word in words:
    print(word, ": index", counter)
    counter += 1
... or indeed use a call to range() and len():
In [ ]:
for i in range(len(words)):
    print(words[i], ": index", i)
A function that makes life in Python much easier in this respect is enumerate(). If we pass a list to enumerate(), it will produce a series of mini-tuples: each mini-tuple contains the index of an item as its first element, and the actual item as its second element:
In [ ]:
print(list(enumerate(words)))
Here, as with range(), we have to convert the result of enumerate() to e.g. a list before we can actually print it. Iterating over the result of enumerate(), on the other hand, is not a problem. Here, we print out each mini-tuple, consisting of an index and an item, in a for-loop:
In [ ]:
for mini_tuple in enumerate(words):
    print(mini_tuple)
When using such for-loops and enumerate(), we can do something really cool. Remember that we can 'unpack' tuples: if a tuple consists of two elements, we can unpack it on one line of code to two different variables via the assignment operator:
In [ ]:
item = (5, 'already')
index, word = item # this is the same as: index, word = (5, "already")
print(index)
print(word)
In our for-loop example, we can apply the same kind of unpacking in each iteration:
In [ ]:
for item in enumerate(words):
    index, word = item
    print(index)
    print(word)
    print("=======")
However, there is also a super-convenient shortcut for this in Python, where we unpack each item in the for-statement already:
In [ ]:
for index, word in enumerate(words):
    print(index)
    print(word)
    print("====")
How cool is that? Note how easy it now becomes to solve our problem with the index above:
In [ ]:
for i, word in enumerate(words):
    print(word, ": index", i)
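As a small addition to the loop above: enumerate() also accepts an optional start argument, which is handy when the numbering should begin at 1 rather than 0. The sketch below reuses the example sentence; the names numbered and position are made up for this illustration.

```python
words = "Be yourself; everyone else is already taken".split()

# start=1 only changes the numbers that are handed out; the words stay the same.
numbered = list(enumerate(words, start=1))
print(numbered)

for position, word in enumerate(words, start=1):
    print(position, word)
```

Keep in mind that these positions are no longer valid list indices: words[7] would raise an IndexError, because the last word still sits at index 6.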
Obviously, enumerate() can be really useful when you're working with lists or other kinds of data sequences. Another helpful function in this respect is zip(). Suppose that we have a small database of 5 books in the form of three lists: the first list contains the titles of the books, the second the authors, while the third list contains the dates of publication:
In [ ]:
titles = ["Emma", "Stoner", "Inferno", "1984", "Aeneid"]
authors = ["J. Austen", "J. Williams", "D. Alighieri", "G. Orwell", "P. Vergilius"]
dates = ["1815", "2006", "Ca. 1321", "1949", "before 19 BC"]
In each of these lists, the third item always corresponds to Dante's masterpiece and the last item to the Aeneid by Vergil, which inspired him. The use of zip() can now easily be illustrated:
In [ ]:
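# print each result explicitly: a cell only displays the value of its last expression
print(list(zip(titles, authors)))
print(list(zip(titles, dates)))
print(list(zip(authors, dates)))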
Do you see what happened here? In fact, zip() really functions like a 'zipper' in the real world: it zips together multiple lists and produces a series of mini-tuples, in which the correct authors, titles and dates are combined with each other. Moreover, you can pass more than two sequences to zip() at once:
In [ ]:
list(zip(authors, titles, dates))
How awesome is that? Here too: don't forget to convert the result of zip() to a list or tuple, e.g. if you want to print it. As with enumerate(), we can now also unpack each mini-tuple when declaring a for-loop:
In [ ]:
for author, title in zip(authors, titles):
    print(author)
    print(title)
    print("===")
As you can understand, this is really useful functionality for dealing with long, complex lists and especially combinations of them.
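Two details of zip() are worth a quick sketch: unpacking works with three sequences just as it does with two, and zip() stops as soon as the shortest sequence runs out. The lists repeat the book data from above; the names lines and short are made up for this illustration.

```python
titles = ["Emma", "Stoner", "Inferno", "1984", "Aeneid"]
authors = ["J. Austen", "J. Williams", "D. Alighieri", "G. Orwell", "P. Vergilius"]
dates = ["1815", "2006", "Ca. 1321", "1949", "before 19 BC"]

# Unpack three values per iteration, one from each list.
lines = []
for title, author, date in zip(titles, authors, dates):
    lines.append(title + " by " + author + " (" + date + ")")
print(lines)

# If one sequence is shorter, the leftover items of the others are silently dropped.
short = list(zip(titles, authors[:2]))
print(short)
```

Because no error is raised when the lengths differ, it can be worth checking with len() that the lists really do line up before zipping them.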
Now it's time to have a look at comprehensions in Python: comprehensions, such as list comprehensions or tuple comprehensions, provide an easy way to create and fill new lists. They are also often used to change one list into another. Typically, comprehensions can be written in a single line of Python code, which is why people often find them more readable than an equivalent for-loop. Let's start with an example. Say that we would like to fill a list with numbers that represent the length of each word in a sentence, but only if that word isn't a punctuation mark. By now, we can of course easily create such a list using a for-loop:
In [ ]:
import string
words = "I have not failed . I’ve just found 10,000 ways that won’t work .".split()
word_lengths = []
for word in words:
    if word not in string.punctuation:
        word_lengths.append(len(word))
print(word_lengths)
We can create the exact same list of numbers using a list comprehension which only takes up one line of Python code:
In [ ]:
word_lengths = [len(word) for word in words if word not in string.punctuation]
print(word_lengths)
OK, impressive, but there are a lot of new things going on here. Let's go through this step by step. The first step is easy: we initialize a variable word_lengths to which we assign a value using the assignment operator. The type of that value will eventually be a list: this is indicated by the square brackets which enclose the list comprehension:
In [ ]:
print(type(word_lengths))
Inside the square brackets, we find the actual comprehension, which determines what goes into our new list. Note that it is not always possible to read these comprehensions from left to right, so you will have to get used to the way they are built up from a syntactic point of view. First of all, we add an expression that determines which elements will make it into our list, in this case: len(word). The variable word, in this case, is generated by the for-clause that follows: for word in words (note that, unlike in a regular for-loop, there is no colon here). Finally, we add a condition that determines whether or not len(word) should be added to our list. In this case, len(word) will only be included in our list if the word is not a punctuation mark: if word not in string.punctuation. This is a full list comprehension, but simpler ones exist. We could for instance have left out the call to len() and added word itself to our list. In this way, we could easily remove all punctuation from our word list:
In [ ]:
words_without_punc = [word for word in words if word not in string.punctuation]
print(words_without_punc)
Moreover, we don't have to include the if-statement at the end (it is always optional):
In [ ]:
all_word_lengths = [len(word) for word in words]
print(all_word_lengths)
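The anatomy described above can be summarised as [expression for item in sequence if condition]. As a sketch, the block below writes the same three parts once as a comprehension and once as an ordinary loop; the use of word.upper() as the expression and the names from_comprehension and from_loop are choices made only for this illustration.

```python
import string

words = "I have not failed . I've just found 10,000 ways that won't work .".split()

# Comprehension: [expression for item in sequence if condition]
from_comprehension = [word.upper() for word in words if word not in string.punctuation]

# The same three parts, written out as an ordinary loop.
from_loop = []
for word in words:                        # for item in sequence
    if word not in string.punctuation:    # if condition
        from_loop.append(word.upper())    # expression

print(from_comprehension)
print(from_comprehension == from_loop)
```

Reading a comprehension in this loop order (for-clause first, then the condition, then the expression) is often easier than reading it strictly from left to right.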
In the comprehensions above, words is the only pre-existing input to our comprehension; all the other variables are created and manipulated inside the comprehension itself. The range() function which we saw at the beginning of this chapter is also often used as the input for a comprehension:
In [ ]:
square_numbers = [x*x for x in range(10)]
print(square_numbers)
Importantly, we can just as easily create a tuple using the same comprehension syntax (strictly speaking, a generator expression), but this time calling tuple() on it, instead of using the square brackets to create a normal list:
In [ ]:
tuple_word_lengths = tuple(len(word) for word in words if word not in string.punctuation)
print(tuple_word_lengths)
print(type(tuple_word_lengths))
This is very useful, especially if you can figure out why the following code block will generate an error...
In [ ]:
tuple_word_lengths = tuple()
for word in words:
    if word not in string.punctuation:
        tuple_word_lengths.append(len(word))
print(tuple_word_lengths)
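For readers who want to check their answer to the question above, here is a minimal sketch: tuples are immutable and therefore have no append() method, so the call raises an AttributeError. The shortened sentence, the variable outcome, and the workaround shown in the second half are assumptions made for this illustration, not part of the original example.

```python
import string

words = "I have not failed .".split()

lengths = tuple()
try:
    lengths.append(len(words[0]))   # tuples cannot be changed in place
    outcome = "appended"
except AttributeError as error:
    outcome = "failed"
    print("Error:", error)

# One possible workaround: build a new, longer tuple in every iteration
# and rebind the name to it.
lengths = tuple()
for word in words:
    if word not in string.punctuation:
        lengths = lengths + (len(word),)
print(lengths)
```

The comprehension-style call to tuple() shown earlier is usually the tidier option, since it creates the tuple in one go instead of creating a new tuple on every pass through the loop.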
Good programmers can do amazing things with comprehensions. With list comprehensions, it becomes really easy, for example, to create nested lists (lists that themselves consist of lists or tuples). Can you figure out what is happening in the following code block?
In [ ]:
nested_list = [[x,x+2] for x in range(10, 22, 3)]
print(nested_list)
print(type(nested_list))
print(type(nested_list[3]))
In the first line above, we create a new list (nested_list), which we fill not with single numbers but with mini-lists that contain two values. We could just as easily have done this with mini-tuples, by using round brackets. Can you spot the differences below?
In [ ]:
nested_tuple = [(x,x+2) for x in range(10, 22, 3)]
print(nested_tuple)
print(type(nested_tuple))
print(type(nested_tuple[3]))
nested_tuple = tuple((x,x+2) for x in range(10, 22, 3))
print(nested_tuple)
print(type(nested_tuple))
print(type(nested_tuple[3]))
Note that zip() can also be very useful in this respect, because you can unpack items inside the comprehension. Do you understand what is going on in the following code block?
In [ ]:
a = [2, 3, 5, 7, 0, 2, 8]
b = [3, 2, 1, 7, 0, 0, 9]
diffs = [a-b for a,b in zip(a, b)]
print(diffs)
Again, more complex comprehensions are possible:
In [ ]:
diffs = [abs(a-b) for a,b in zip(a, b) if (a & b)]
print(diffs)
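One thing to watch in the condition above is that & is Python's bitwise AND, which is not the same as the logical operator and. The sketch below uses the same numbers under different names (first, second, x and y are made up here, to avoid reusing a and b for both the lists and their items) and adds one extra pair, (1, 2), that is not in the original lists, to make the difference visible.

```python
first = [2, 3, 5, 7, 0, 2, 8, 1]
second = [3, 2, 1, 7, 0, 0, 9, 2]

# Bitwise AND: compares the binary digits, so 1 & 2 gives 0 and the pair is skipped.
bitwise = [abs(x - y) for x, y in zip(first, second) if (x & y)]

# Logical and: the pair is kept whenever both numbers are non-zero.
logical = [abs(x - y) for x, y in zip(first, second) if (x and y)]

print(bitwise)
print(logical)
```

For the seven original pairs both conditions keep exactly the same items; they only disagree on pairs such as (1, 2), where both numbers are non-zero but share no binary digits.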
Great: you are starting to become a real pro at comprehensions! The following, very dense code block, however, might be more challenging: can you figure out what is going on?
In [ ]:
A = tuple([x-1,x+3] for x in range(10, 100, 3))
B = [(n*n, n+50) for n in range(10, 1000, 3) if n <= 100]
sums = sum(tuple(item_a[1]+item_b[0] for item_a, item_b in zip(A[:10], B[:10])))
print(sums)
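One way to take the dense block above apart is to give every intermediate result its own name. This is only a sketch of a possible reading: the names second_of_a, first_of_b, pair_sums and total are invented for the illustration.

```python
A = tuple([x - 1, x + 3] for x in range(10, 100, 3))
B = [(n * n, n + 50) for n in range(10, 1000, 3) if n <= 100]

# Step 1: from the first ten items of A, keep the second value (x + 3).
second_of_a = [item_a[1] for item_a in A[:10]]

# Step 2: from the first ten items of B, keep the first value (n * n).
first_of_b = [item_b[0] for item_b in B[:10]]

# Step 3: add the values pairwise, then sum all the pairwise results.
pair_sums = [a_value + b_value for a_value, b_value in zip(second_of_a, first_of_b)]
total = sum(pair_sums)

print(second_of_a)
print(first_of_b)
print(total)
```

Splitting a one-liner like this costs a few extra lines, but it makes each step easy to print and inspect on its own.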
Finally, we should mention that dictionaries and sets can also be filled in a one-liner using such comprehensions. For sets, the syntax runs entirely parallel to that of list and tuple comprehensions, but here, we use curly brackets to surround the expression:
In [ ]:
text = "This text contains a lot of different characters, but probably not all of them."
chars = {char.lower() for char in text if char not in string.punctuation}
print(chars)
For dictionaries, which consist of key-value pairs, the syntax is only slightly more complicated. Here, you have to make sure that you link the correct key to the correct value using a colon, in the very first part of the comprehension. The following example will make this clearer:
In [ ]:
counts = {word:len(word) for word in words}
print(counts)
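Dictionary comprehensions combine nicely with zip(), since each mini-tuple can be unpacked straight into a key and a value. The sketch below reuses the book lists from earlier in the chapter; the names author_of and lengths, and the short example sentence, are made up for this illustration.

```python
titles = ["Emma", "Stoner", "Inferno", "1984", "Aeneid"]
authors = ["J. Austen", "J. Williams", "D. Alighieri", "G. Orwell", "P. Vergilius"]

# key: value comes first, followed by the usual for-clause.
author_of = {title: author for title, author in zip(titles, authors)}
print(author_of["Inferno"])

# Keys are unique: a repeated word simply overwrites its earlier entry.
words = "to be or not to be".split()
lengths = {word: len(word) for word in words}
print(lengths)
```

The second dictionary has four entries rather than six, because "to" and "be" each occur twice in the sentence.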
You've reached the end of Chapter 7!